Overview of operations
Category: Import Dataset
Operation name | Operation function
Random Values | Generates a dataset with a defined number and type of fields, filled with random values.
Text File | Import data as strings from a delimited text file, or from a folder containing delimited text files.
Database Table | Import a dataset by connecting to an OLE DB or ODBC data source and selecting the database where the tables are listed. You can either generate a query, which imports all the selected fields and rows, or enter your own query for full control over which columns and rows are imported.
Wave Generator | Generate a dataset with values that follow the shape of four wave types: sine, square, triangle, or saw. There can be more than one wave of the same type, or a combination of different types. Each wave represents a generated field.
Proficy Historian | Import data from an online Proficy Historian data source.
Proficy Historian A&E | Import Alarm and Event data from an online Proficy Historian data source.
Source Block | Import data from a source block containing either historical or real-time data. Data can be either discrete or continuous. Configure the selected source block as required, e.g. by adding a timestamp field for continuous sources or selecting an index for discrete sources.
.NET Wrapper Source | External .NET source objects containing custom functionality can be integrated using the .NET Wrapper operation. The .NET Wrapper enables the reuse of existing code and continuity between applications. Use this operation to load a .NET source object from a .NET class library assembly file (.dll). The selected source object determines how your data is generated. Configure either a continuous or discrete source.
Control Loop Data | Import cached control loop data from deployed CSense Process+ control loops. |
Category: Add Fields
Operation name | Operation function
.NET Expression | Add one or more fields to the existing dataset by writing an expression in either C# or VB.NET. The values in the new field are determined by the expression; they can be valid values or null entries, and can be of any data type.
.NET Script | Add one or more fields to the existing dataset by writing a script in either C# or VB.NET. The values in the new fields are determined by the script; they can be valid values or null entries, and can be of any data type.
Shift Values | Create an offset and shift the data values in a specific field. An offset shifts the values of the selected field up or down by row position. In the rows vacated by the offset, the values are null. Offsets are added one field at a time, and multiple offsets can be added for a field, resulting in more than one offset field being added to the dataset.
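As an illustrative sketch only (the function name and list representation below are hypothetical, not part of the product), the offset behaviour can be modelled in Python:

```python
def shift_field(values, offset):
    """Shift a field's values by `offset` rows; vacated positions become None.

    A positive offset moves values down to later rows; a negative offset
    moves them up to earlier rows. Values shifted past either end are dropped.
    """
    n = len(values)
    shifted = [None] * n
    for i, v in enumerate(values):
        j = i + offset
        if 0 <= j < n:
            shifted[j] = v
    return shifted

# shift_field([10, 20, 30, 40], 2) -> [None, None, 10, 20]
```

Adding the shifted list as a new field alongside the original reproduces the "offset field" described above.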
Classify | Assign classes to a range of data within one specified field. This is useful for quick classification and interpretation of the data values.
Copy | Create a copy of the selected fields and add these copies, with generated field names, to the existing dataset.
Concatenate | Concatenation joins two items to form a new item that combines both parts. Use this operation to concatenate different field values into a new string value, within a newly created field. The modified dataset gains an extra field containing the concatenated values; other existing fields remain unaffected.
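A minimal Python sketch of the concatenation behaviour, assuming rows are represented as dictionaries (an assumption for illustration only; the separator parameter is also hypothetical):

```python
def concatenate_fields(rows, fields, new_field, sep=""):
    """Add a new string field whose value joins the chosen fields' values."""
    for row in rows:
        row[new_field] = sep.join(str(row[f]) for f in fields)
    return rows

data = [{"plant": "A", "unit": 1}, {"plant": "B", "unit": 2}]
concatenate_fields(data, ["plant", "unit"], "tag", sep="-")
# data[0]["tag"] -> "A-1"; the existing "plant" and "unit" fields are untouched
```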
SQL Expression | Enter multiple SQL expressions that are executed to create new fields in the dataset. The resulting dataset contains all the original fields, as well as the new values generated by the SQL expressions.
Rate Of Change | Measure the rate of change between data points in the input fields. The rate of change is calculated as the difference between the current value and the previous value of the field, divided by the period between the points.
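The calculation can be sketched in Python (illustrative only; the product performs this internally, and the first row has no previous value to difference against):

```python
def rate_of_change(values, timestamps):
    """(current - previous) / elapsed period; the first row yields None."""
    rates = [None]
    for i in range(1, len(values)):
        dt = timestamps[i] - timestamps[i - 1]
        rates.append((values[i] - values[i - 1]) / dt)
    return rates

# Samples taken at t = 0, 10 and 20 seconds:
# rate_of_change([0, 5, 15], [0, 10, 20]) -> [None, 0.5, 1.0]
```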
Category: Remove
Operation name | Operation function
Delete Fields | Delete selected fields within a dataset.
Category: Modify Dataset
Operation name | Operation function
Clip Values | Clipping changes the values of selected fields when those values fall beyond the limits defined for the fields. Values outside the limits can either be set as bad quality or take on the defined lower or upper limit value.
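An illustrative Python sketch of the two clipping behaviours described above (the mode names and the use of None for a bad-quality value are assumptions, not the product's representation):

```python
def clip_values(values, lower, upper, mode="limit"):
    """Clip out-of-range values: 'limit' snaps to the nearest bound,
    'bad' marks the value as bad quality (None in this sketch)."""
    out = []
    for v in values:
        if lower <= v <= upper:
            out.append(v)
        elif mode == "limit":
            out.append(lower if v < lower else upper)
        else:  # treat out-of-range values as bad quality
            out.append(None)
    return out

# clip_values([-5, 3, 12], 0, 10)              -> [0, 3, 10]
# clip_values([-5, 3, 12], 0, 10, mode="bad")  -> [None, 3, None]
```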
Replace Empty Values | Replace all the empty values of a selected field with the previous value, next value, or a selected statistical value. The selected fill option is applied to all the selected fields.
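The previous-value and next-value fill options can be sketched as follows (illustrative only; None stands in for an empty value, and a gap with no value in the fill direction stays empty):

```python
def fill_empty(values, mode="previous"):
    """Replace None entries with the previous or next non-empty value."""
    out = list(values)
    # Walk forwards for previous-value fill, backwards for next-value fill.
    rng = range(len(out)) if mode == "previous" else range(len(out) - 1, -1, -1)
    last = None
    for i in rng:
        if out[i] is None:
            out[i] = last
        else:
            last = out[i]
    return out

# fill_empty([1, None, None, 4])              -> [1, 1, 1, 4]
# fill_empty([1, None, None, 4], mode="next") -> [1, 4, 4, 4]
```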
Replace Partial Strings | Replace all string values in a specified field that match a search pattern. Specific text or wildcard characters may be included in the search pattern; all matching values are replaced with a new string value.
Scale Values | Scale the data values in selected fields by multiplying the data by a specified number. One use of this operation is quick conversion between units of measurement, provided the conversion is linear.
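For example, a kilogram-to-pound conversion is linear through zero, so a single multiplier suffices; a minimal Python sketch (function name is hypothetical):

```python
def scale_field(values, factor):
    """Multiply every value in a field by a constant scale factor."""
    return [v * factor for v in values]

KG_TO_LB = 2.20462  # pounds per kilogram

# scale_field([1.0, 10.0], KG_TO_LB) -> [2.20462, 22.0462]
```

Note that a conversion with an offset, such as Celsius to Fahrenheit, is not purely multiplicative and cannot be done with scaling alone.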
Replace Strings | Rename string values in string fields, one field at a time. This operation is useful for classification, when you want to replace a classifying string name.
SQL Update | Set field values by writing a SQL UPDATE query with a SET clause, and define a WHERE clause if required. The WHERE clause specifies exactly where in the dataset to apply the SET clause. This operation modifies the existing dataset by changing the selected field values to those specified in the SET clause.
Detect Inactivity | Detect fields that have become inactive. Inactive fields are identified when the values don't change for a specified period; this period can be either timestamp or row based. The operation then marks the values that fall outside the defined period as bad quality and assigns an empty value.
Interpolate Empty Values | Interpolation is used to calculate an unknown value between two known values. It is used to construct new data points or replace empty values within the range of a discrete set of known data points. Interpolation in this operation is linear, and can be used to interpolate integer, double and timestamp values.
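Linear interpolation of empty values can be sketched as follows (illustrative only; None stands in for an empty value, and gaps at the start or end of the data, which have only one known neighbour, are left unfilled):

```python
def interpolate_empty(values):
    """Linearly interpolate runs of None between the nearest known neighbours."""
    out = list(values)
    i = 0
    while i < len(out):
        if out[i] is None:
            j = i
            while j < len(out) and out[j] is None:
                j += 1  # find the end of the empty run
            if i > 0 and j < len(out):  # both neighbours are known
                lo, hi = out[i - 1], out[j]
                span = j - (i - 1)
                for k in range(i, j):
                    out[k] = lo + (hi - lo) * (k - (i - 1)) / span
            i = j
        else:
            i += 1
    return out

# interpolate_empty([0, None, None, 30]) -> [0, 10.0, 20.0, 30]
```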
Time Manipulation | Change the times of the timestamp field by either offsetting or warping the timestamps. Offsetting adjusts the timestamps by a set period of days, hours, seconds and fractions of seconds. Warping defines a different start and end time; the timestamps are changed to fit between the newly selected start and end times. The number of rows never changes and the chronological order of the data remains constant; only the timestamps in the current dataset are modified.
Rename Fields | Rename the fields of the dataset.
Category: Transform Datasets
Operation name | Operation function
Aggregate | Perform mathematical operations, such as finding the average value, the standard deviation and the row count, on groups of numbers. Integer or string fields can be selected for grouping, and the required aggregate operation is performed on these grouped fields.
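A minimal Python sketch of grouping on one field and aggregating another (function and field names are hypothetical, and only count and average are shown):

```python
from collections import defaultdict

def aggregate(rows, group_field, value_field):
    """Group rows by one field and compute count and average of another."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_field]].append(row[value_field])
    return {k: {"count": len(v), "average": sum(v) / len(v)}
            for k, v in groups.items()}

data = [{"line": "A", "temp": 10},
        {"line": "A", "temp": 20},
        {"line": "B", "temp": 5}]
# aggregate(data, "line", "temp")
# -> {"A": {"count": 2, "average": 15.0}, "B": {"count": 1, "average": 5.0}}
```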
Copy | Create an exact copy of the dataset.
Reverse | Reverse the order of the values of the dataset. This operation requires no configuration; clicking the operation creates a new dataset with the order of the values reversed.
Convert Types | Change the data types of the fields in the dataset. Valid data types are integer, double, string and date/time. Fields that are not converted can either be included in the resulting dataset, or ignored and excluded from it.
Sort | Prioritize the dataset fields, then sort the values in the rows of the highest-priority field in ascending or descending order. The corresponding rows of the other dataset fields are moved according to the sorting of the highest-priority field. Where rows contain identical values in the highest-priority field, the sorting criteria of the second-highest-priority field are considered. It is therefore important to ensure that the fields of greatest priority are listed highest among the selected fields, and that ascending or descending order is stipulated for each field.
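Multi-priority sorting with a per-field direction can be sketched as follows (illustrative only; rows are dictionaries and the priority list pairs each field with an ascending flag):

```python
import functools

def sort_rows(rows, priorities):
    """Sort rows by (field, ascending) pairs; earlier pairs take priority.

    A comparator is used so each field can honour its own direction;
    ties on one field fall through to the next-priority field.
    """
    def compare(a, b):
        for field, ascending in priorities:
            if a[field] != b[field]:
                result = -1 if a[field] < b[field] else 1
                return result if ascending else -result
        return 0

    return sorted(rows, key=functools.cmp_to_key(compare))

rows = [{"grade": 2, "t": 1}, {"grade": 1, "t": 5}, {"grade": 1, "t": 3}]
# sort_rows(rows, [("grade", True), ("t", False)])
# -> grade 1 rows first; within them t descends: t=5, t=3, then grade 2 (t=1)
```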
Transpose | Transpose data for summarization, producing meaningful information from a table of data. While the structure of the original dataset is not changed in any way, transposing using pivot tables can automatically sort, count, and sum the data stored in one table and create a second table (the "pivot table") to present the summarized data. The pivot table can present several kinds of aggregations, for example sum, average, standard deviation, and count.
Cross Correlation | Automatically calculate the lag required for the highest correlation between a change in a field value and the effect seen on the process target field. The lag is calculated either over time or over the row indexes.
Correlation | Create a correlation matrix of selected fields. The correlation matrix can be created using either an index or the timestamp as the base for the correlation calculation. Correlation is represented as a number indicating the strength of a linear relationship between two random variables. This operation creates a new dataset containing the correlation matrix.
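The usual measure of linear relationship is the Pearson coefficient (an assumption about the exact statistic used here); a sketch for one pair of fields, where the product would compute one such number for every pair in the matrix:

```python
import math

def pearson(x, y):
    """Pearson correlation: covariance over the product of standard deviations.

    Returns a value in [-1, 1]; 1 is a perfect positive linear relationship,
    -1 a perfect negative one, and 0 no linear relationship.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# pearson([1, 2, 3], [2, 4, 6]) -> 1.0 (y is exactly 2x)
```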
Delay | Create a dataset with delayed data values for selected fields. These delays can be applied either by row number or as a number of seconds. Multiple fields can be delayed, each by different values.
ISV Action Object | Reuse the functionality of an existing data manipulation blueprint created within the Architect. Use this operation to provide new data as the source for the Action Object blueprint, by mapping the new data fields to those required within the Action Object. The functionality of each block is then applied to the new data. The blueprint outputs are no longer sent to a sink block within the Architect; instead they are saved as a new dataset. This operation either uses the timestamp field of the dataset, or creates a timestamp field for the Action Object. Bad quality fields are recorded in the Action Object with empty values. Note that, for this operation to work, all the blocks used to create the Action Object blueprint within the Architect must be licensed and registered to you. If they are not registered, the operation will fail.
Join Timestamps | Merge two or more timestamp fields into one timestamp field, with all timestamps listed chronologically. The data values of other fields are still listed at their original timestamps, and can either be repeated for the additional timestamps in the joined timestamp field, or be interpolated to generate values for the additional timestamps. This operation creates a new dataset with new fields, containing both the original data values and the repeated or interpolated values corresponding to the additional timestamps. Original fields need to be mapped to the selected timestamp fields during configuration.
Moving Statistics | Calculate statistics of the values of each field over a specified window. The window moves across the rows of data, with the statistics calculated either over a number of rows or over a specified time period.
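A row-based moving average, one of the statistics such a window can produce, can be sketched as follows (illustrative only; rows before the window is full are left empty in this sketch):

```python
def moving_average(values, window):
    """Average over a row-based window ending at each row.

    Emits None until enough rows have been seen to fill the window.
    """
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window:i + 1]
            out.append(sum(chunk) / window)
    return out

# moving_average([1, 2, 3, 4], 2) -> [None, 1.5, 2.5, 3.5]
```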
Resample | Resampling creates a new dataset with new start and end times and a different sampling period. The values of the dataset are not changed.
Statistics | Calculate statistics for selected fields. Statistics can be calculated across data rows or over a time span, where the average between two consecutive timestamps is used. This operation does not affect the original dataset in any way; a new dataset is created listing the statistics for the selected fields.
Category: Filter Datasets
Operation name | Operation function
Limit Values | Filter out values in your data by applying upper and/or lower limits. These limits can be different for each selected field.
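An illustrative sketch of per-field limit filtering (the row and limit representations below are assumptions): rows survive only if every limited field is within its own range.

```python
def filter_limits(rows, limits):
    """Keep only rows whose values fall within each field's (lower, upper) limits."""
    def in_range(row):
        return all(lo <= row[f] <= hi for f, (lo, hi) in limits.items())
    return [row for row in rows if in_range(row)]

rows = [{"temp": 50, "flow": 3}, {"temp": 120, "flow": 4}]
# filter_limits(rows, {"temp": (0, 100)}) keeps only the first row
```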
Empty Values | Filter out the empty values from the selected fields, and create a new dataset where the rows that previously contained the empty values have been removed.
Timestamps | Filter specific fields based on their timestamps by applying an upper and/or lower limit. This creates a new dataset whose timestamp field contains only timestamps that fall within the specified limits.
SQL Expression | Filter selected fields by defining a WHERE clause in the SQL expression, indicating the conditions that values must meet to be included in the new dataset. Only the selected fields are listed in the new dataset.
Category: Combine Datasets
Operation name | Operation function
Horizontal | Combine two datasets side by side. Choose the row numbers at which each dataset aligns when merged. There is an option to merge only rows where both datasets have values at the same row number, eliminating rows that contain data from only one dataset. This operation does not depend on the timestamp field in any way; timestamp fields are treated like any other field and are listed next to each other in the new dataset, without influencing the order of the values of the other fields.
Relational | Merge two datasets at the points where the values of selected fields in one dataset match the values of selected fields in the second dataset. Then choose how the resulting dataset is presented: include all the merged fields; only rows from both datasets that share common values; only rows where shared common values are excluded; or unmatched rows from both datasets, only the first dataset, or only the second dataset. This enables creating a combined dataset listing only the relevant data.
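The "rows sharing common values" option behaves like an inner join; an illustrative Python sketch (function and field names are hypothetical, and only a single key field is matched):

```python
def relational_merge(left, right, key):
    """Inner-join two row lists on a key field: keep only matching rows."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    merged = []
    for lrow in left:
        for rrow in index.get(lrow[key], []):
            combined = dict(lrow)   # fields from the first dataset
            combined.update(rrow)   # plus fields from the second
            merged.append(combined)
    return merged

a = [{"batch": 1, "temp": 50}, {"batch": 2, "temp": 60}]
b = [{"batch": 2, "ph": 7.1}]
# relational_merge(a, b, "batch") -> [{"batch": 2, "temp": 60, "ph": 7.1}]
```

The other presentation options described above correspond to the familiar outer, anti- and one-sided join variants.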
Timestamp Range Merge | Merge two datasets, combining only the data values that fall within a defined timestamp range. The new dataset contains the timestamps that fall within the specified range and the corresponding data values of the merged datasets.
Vertical | Merge two datasets one below the other. A new dataset is created with the left-hand dataset at the top and the right-hand dataset underneath. Select the fields from the two datasets to include in the new dataset, then map the selected fields from each dataset to fields of the same type. This operation does not depend on the timestamp field and will not merge the timestamp fields into one chronological sequence; the resulting timestamp field contains the timestamps of the first dataset, followed by the timestamps of the second dataset.
Time | Merge two datasets, retaining only selected fields from each dataset. The new dataset contains only one timestamp field, which combines the timestamps of the two datasets. Where the two datasets have overlapping timestamps, only one timestamp is used and the values from both original datasets are listed. Where a timestamp did not exist in an original dataset, the last known value is used, or the value is left empty if there is no prior value. Choose either to ignore empty timestamps in the new dataset, or to let the operation fail; this ensures a dataset with no empty values in the timestamp field.
Category: Export Datasets
Operation name | Operation function
Text File | Select the dataset to export, and define the file name, file location and delimiter to use. Save the exported file as a .csv or .txt file.
Sink Block | Export any dataset to a sink block, such as a Text, Optimised Database, or Tabular Database sink. The sink block transfers data to a source block for use in other applications. Select which fields to export and configure timestamps and data qualities for these fields.
.NET Wrapper | Sink a dataset to a .NET object that was created in a separate application prior to using this operation. The sink object is a .NET assembly library file (.dll). Using the .NET object allows reuse and interchangeability of code and datasets between different programs. The previously created .NET object determines the format in which the dataset is exported, and how the dataset must be configured in order to be used by the .NET object.
Technical comparison of operations
Category | Operation Name | Effect on dataset | Requires dataset? | Datasets required | Apply to a locked dataset? | Can reconfigure? | Knowledge required | Potential time required | Description
Import Dataset | .NET Wrapper Source | Import | No | 0 | n/a | Yes | Software | Low | Import from a .NET Wrapper Source
Import Dataset | Database Table | Import | No | 0 | n/a | Yes | None | High | Import table data
Import Dataset | Proficy Historian | Import | No | 0 | n/a | Yes | None | High | Import from a Proficy Source
Import Dataset | Proficy Historian A&E | Import | No | 0 | n/a | Yes | None | High | Import Alarms and Events from a Proficy Source
Import Dataset | Random Values | Import | No | 0 | n/a | Yes | None | High | Generate random data
Import Dataset | Source Block | Import | No | 0 | n/a | Yes | Software | High | Architect Source block
Import Dataset | Text File | Import | No | 0 | n/a | Yes | None | High | Imports data from a delimited text file
Import Dataset | Wave Generator | Import | No | 0 | n/a | Yes | None | High | Generates different types of waves
Add Fields | .NET Expression | Modify | Yes | 1 | No | Yes | .NET | High | Create a new field from a .NET expression
Add Fields | .NET Script | Modify | Yes | 1 | No | Yes | .NET | High | Create new fields from a C# or VB.NET script
Add Fields | Classify | Modify | Yes | 1 | No | No | None | Medium | Assigns classes over certain ranges within fields
Add Fields | Concatenate | Modify | Yes | 1 | No | No | None | Low | String concatenation
Add Fields | Copy | Modify | Yes | 1 | No | No | None | Low | Copy fields
Add Fields | Rate Of Change | Modify | Yes | 1 | No | No | Software | High | Rate of change
Add Fields | Shift Values | Modify | Yes | 1 | No | No | None | Low | Create offset fields
Add Fields | SQL Expression | Modify | Yes | 1 | No | No | SQL | Low | Create your own SQL expressions
Remove | Delete Fields | Modify | Yes | 1 | No | No | None | Low | Delete selected fields
Modify Dataset | Clip Values | Modify | Yes | 1 | No | Yes | .NET | Low | Clipping
Modify Dataset | Detect Inactivity | Modify | Yes | 1 | No | No | Software | High | Detect inactivity
Modify Dataset | Interpolate Empty Values | Modify | Yes | 1 | No | No | Software | High | Interpolation
Modify Dataset | Rename Fields | Modify | Yes | 1 | No | Yes | None | Low | Rename fields
Modify Dataset | Replace Empty Values | Modify | Yes | 1 | No | Yes | None | Low | Fills the null values with the previous or next value
Modify Dataset | Replace Partial Strings | Modify | Yes | 1 | No | No | None | Low | Find and replace strings
Modify Dataset | Replace Strings | Modify | Yes | 1 | No | No | None | Low | Replace string fields
Modify Dataset | Scale Values | Modify | Yes | 1 | No | No | None | Low | Scale fields
Modify Dataset | SQL Update | Modify | Yes | 1 | No | No | SQL | Low | Update values using SQL code
Modify Dataset | Time Manipulation | Modify | Yes | 1 | No | No | Software | High | Time manipulation
Transform Dataset | Aggregate | Create | Yes | 1 | Yes | No | None | Low | Aggregate fields
Transform Dataset | Convert Types | Create | Yes | 1 | Yes | No | None | High | Converts from one data type to another
Transform Dataset | Copy | Create | Yes | 1 | Yes | n/a | None | Low | Copy the dataset
Transform Dataset | Correlation | Create | Yes | 1 | Yes | No | Software | High | Correlation
Transform Dataset | Cross Correlation | Create | Yes | 1 | Yes | No | Software | Medium | Auto lag fields
Transform Dataset | Delay | Create | Yes | 1 | Yes | No | Software | High | Delay
Transform Dataset | ISV Action Object | Create | Yes | 1 | Yes | No | Software | High | ISV operation
Transform Dataset | Join Timestamps | Create | Yes | 1 | Yes | No | Software | High | Join timestamp fields
Transform Dataset | Moving Statistics | Create | Yes | 1 | Yes | Yes | Software | Low | Calculates moving statistics for the selected fields
Transform Dataset | Resample | Create | Yes | 1 | Yes | No | Software | High | Resample
Transform Dataset | Reverse | Create | Yes | 1 | Yes | n/a | None | Low | Reverses the order of the values in the dataset
Transform Dataset | Sort | Create | Yes | 1 | Yes | No | None | Low | Sort fields in ascending or descending order
Transform Dataset | Statistics | Create | Yes | 1 | Yes | No | Software | Low | Calculates statistics for the selected fields
Transform Dataset | Transpose | Create | Yes | 1 | Yes | No | None | Low | Transpose fields
Filter Dataset | Empty Values | Create | Yes | 1 | Yes | No | None | Low | Filter null fields
Filter Dataset | Limit Values | Create | Yes | 1 | Yes | Yes | None | Low | Filter fields within user-defined limits
Filter Dataset | SQL Expression | Create | Yes | 1 | Yes | No | SQL | Low | Filter values using SQL code
Filter Dataset | Timestamps | Create | Yes | 1 | Yes | No | None | Low | Filter timestamps within user-defined limits
Combine Dataset | Horizontal | Create | Yes | 2 | Yes | No | None | Low | Merge tables horizontally on row numbers
Combine Dataset | Relational | Create | Yes | 2 | Yes | No | None | Medium | Merge tables on chosen fields
Combine Dataset | Time | Create | Yes | 2 | Yes | No | Software | High | Time merge
Combine Dataset | Timestamp Range Merge | Create | Yes | 2 | Yes | No | Software | High | Data merge within a timestamp range
Combine Dataset | Vertical | Create | Yes | 2 | Yes | No | None | Low | Merge tables vertically
Export Dataset | .NET Wrapper | Export | Yes | 1 | Yes | Yes | Software | High | Export to a .NET Wrapper Sink
Export Dataset | Sink Block | Export | Yes | 1 | Yes | Yes | Software | High | Export to an Architect Sink block
Export Dataset | Text File | Export | Yes | 1 | Yes | Yes | None | Medium | Export the dataset to a text file
Icons
Icons within Data Preparation are made up of a combination of elements showing the following:
- Category
- Knowledge required
- Time required
Categories | Icon | Representation
Import | | Import data from a variety of different sources.
Add | | Add fields to the dataset.
Remove | | Remove fields from the dataset.
Modify | | Modify the current dataset.
Transform | | Transform the structure of the dataset.
Filter | | Filter data according to stipulated criteria.
Combine | | Combine datasets in different ways.
Export | | Export datasets.
Icon Overlays | Icon | Representation
Locked dataset | | No fields can be added to the existing dataset, nor can the current dataset be modified in any way.
Analyzed dataset | | Dataset is loaded for analysis.
Three red chevrons | | These operations are at a high risk of being time and resource intensive, depending on the volume of data and user-specified configuration.
Two amber chevrons | | These operations are at a medium risk of being time and resource intensive, depending on the volume of data and user-specified configuration.
One green chevron | | These operations are at a low risk of being time and resource intensive, depending on the volume of data and user-specified configuration.
SQL | | Knowledge of SQL may be required to configure the operation.
.NET | | Knowledge of .NET programming languages such as C# or VB.NET may be required to configure the operation.
Related topics: